Motivation

Never before in history have there been so many people on Earth as right now. The number has soared over the years, from around 1 billion in 1800 to 7.5 billion in 2017.

The population at earlier times has been estimated too: when agriculture emerged, around 10,000 BC, estimates of the world population range between 1 million and 15 million. Even earlier, about 70,000 years ago, some studies suggest that humans may have gone through a bottleneck of 1,000 to 10,000 people, according to the theory of the Toba supervolcanic eruption.

Given the population growth of the last century, what should we expect for the next one? Will it lead to major changes in our lifestyle, or to wars, poverty, shortages of primary resources and so on?
Or maybe all those are just unwarranted fears and everything is going to fix itself?

Demographic Transition

What happens when a poor country starts to move toward prosperity and an industrialized economic system? The birth rate, which is usually high in a poor country, is no longer offset by a high death rate, and the population starts to grow. Then, at a certain point, fears about overpopulation start to rise.
In 1929 the American demographer Warren Thompson developed the theory of the Demographic Transition [3], which describes a transition from high birth and death rates to lower birth and death rates.

Depending on the version, the theory describes four or five stages in the trend of population growth. Here’s a summary of the five stages:

Let’s have a look at the situation in Italy, for example: in the graphs below you can see the trend of births, deaths, and the total population from 1960 to 2016.


To get a better view of the situation I searched for a dataset that also included earlier years: the graph below shows the trend of the total population starting from 1700.


The green shaded area of the second graph covers the same period as the green-line graph above.

Italy is currently in Stage 4 of the Demographic Transition Model: as the graphs above show, both the birth rate and the death rate are low; moreover, the Population Growth Rate (PGR) is low, so the population is stabilizing. The PGR is defined as:

\[ PGR = \frac{P(t_2) - P(t_1)}{P(t_1)(t_2 - t_1)} \]

Here \(P(t_1)\) and \(P(t_2)\) are the population at times \(t_1\) and \(t_2\). A positive PGR indicates that the population is increasing, while a negative one indicates that it is decreasing. A value of zero means that the population has not changed over the selected time interval.
Let’s compute it on the data used above, taking one year as the time interval.
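
As a rough illustration of the formula, here is a minimal sketch in base R. The function name `pgr` and the population figures are made up for this example; they are not the World Bank values used for the plot.

```r
# Population Growth Rate between consecutive observations:
# PGR = (P(t2) - P(t1)) / (P(t1) * (t2 - t1))
pgr <- function(population, time) {
  stopifnot(length(population) == length(time), length(population) >= 2)
  delta_p <- diff(population)      # P(t2) - P(t1)
  delta_t <- diff(time)            # t2 - t1
  p_start <- head(population, -1)  # P(t1)
  delta_p / (p_start * delta_t)
}

# Invented numbers, for illustration only (NOT real population data).
years      <- c(2013, 2014, 2015, 2016)
population <- c(1000, 1010, 1010, 1005)

rates <- pgr(population, years)
data.frame(from = head(years, -1), to = tail(years, -1), pgr = rates)
```

The three intervals of this toy series cover the three cases described above: growth, no change, and decline.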

As we can see from the plot, the PGR values are quite low, and in the last years (2015 - 2016) they even start to become negative. This is a good indication that Italy’s population is starting to decline, and hence that Italy is currently in Stage 4 of the Demographic Transition.

Let’s now have a look at the World’s situation. Below you can see a leaflet map of the distribution of people across the whole world:





Gross domestic product

The GDP is defined as “an aggregate measure of production equal to the sum of the gross values added of all resident and institutional units engaged in production (plus any taxes, and minus any subsidies, on products not included in the value of their outputs).” It is also considered the “world’s most powerful statistical indicator of national development and progress”.

Here I computed the percentage growth for every year, for the whole world:

\[ \frac{P_{t} - P_{t_0}}{P_{t_0}} \cdot 100 \] where \(P_{t}\) represents the population in a given year and \(P_{t_0}\) the population in the preceding year.

The image above gives a view of how people are distributed across the Continents and the different Regions.

What are the Prospects?

Sources and Analysis

Sources Description

For this study I joined various datasets. I started from the dataset “Countries of the world”, which you can find on Kaggle at https://www.kaggle.com/fernandol/countries-of-the-world and which contains free data from the World Factbook.

I then extended my analysis by combining this dataset with some additional ones I found on World Bank Open Data at https://data.worldbank.org/, from which I downloaded the data collections on total population, birth rate and death rate (both per 1,000 people) and the Gross Domestic Product of each Country; those datasets contain measures from 1960 to 2016 for almost every Country in the world, but some values are missing.
To analyze the situation in Italy in earlier years as well (1700 - 1960), I added the data found here: http://www.populstat.info/Europe/italyc.htm
Lastly, to have an estimate of the world’s population from the year 1 AD, I also took data from here: www.ecology.com/population-estimates-year-2050/

Data Description

I carried out my analysis by means of several R packages and utilities: dplyr, leaflet, ggplot2 and tidyr are the main ones, but I also used geojsonio, rworldmap and countrycode for parsing and for the leaflet plots, and htmlwidgets and htmltools to save and plot some interactive maps.

Let’s first take a look at the World Bank Open Data datasets: when you download an indicator, you end up with three files. The first one is the actual dataset which, for the “Total Population” indicator for example, contains the following columns:

  • “Country.Name”
  • “Country.Code”
  • “Indicator.Name”
  • “Indicator.Code”
  • years from 1960 to 2017 in the form “X1960”

Indicator Name and Code are of little use for our purposes, because they only give a brief description of the table content. Country name and code are, on the other hand, essential: each row of the table contains all the data for one single Country, for that indicator, for the 1960 - 2017 time frame. The columns named after the years contain the actual data.
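
To make this layout concrete, here is a small sketch that builds a toy table with the same columns and reshapes it into one row per Country and year. The values are invented, and the long format is only an assumption about what is convenient for plotting with ggplot2; the original scripts may have kept the wide layout.

```r
library(dplyr)
library(tidyr)

# Toy table mimicking the World Bank layout; the figures are invented.
total_population <- data.frame(
  Country.Name   = c("Italy", "France"),
  Country.Code   = c("ITA", "FRA"),
  Indicator.Name = "Population, total",
  Indicator.Code = "SP.POP.TOTL",
  X1960 = c(100, 200),
  X1961 = c(110, 210)
)

population_long <- total_population %>%
  # The indicator columns carry the same value on every row, so drop them.
  select(-Indicator.Name, -Indicator.Code) %>%
  # One row per Country and year instead of one column per year.
  pivot_longer(
    cols         = matches("^X[0-9]{4}$"),
    names_to     = "Year",
    names_prefix = "X",
    values_to    = "Population"
  ) %>%
  mutate(Year = as.integer(Year))

population_long
```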

The other two tables of the dataset contain some Metadata:

  • In the first one we find a source note, which better describes the indicator. For the “Total Population” indicator it reads: “Total population is based on the de facto definition of population, which counts all residents regardless of legal status or citizenship. The values shown are midyear estimates.”
  • In the second one we find more precise descriptions for each Country. I have not used these data, though.

Data Analysis

I started with some data parsing: first of all I renamed the year columns to remove the ‘X’ in front of each year. I then decided to add to each Country the “Continent” and “Region” indicators (the latter indicates more precisely in which part of the Continent the Country is located), to check the number of people and their distribution as a function of them. To do this I had to modify the dataset by adding some columns by means of dplyr and the mutate command. This analysis can be found in the “world.R” script file.
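
The two parsing steps could look roughly like the sketch below. It is not the content of “world.R”: the toy values are invented, and deriving Continent and Region from the ISO code with countrycode is an assumption about how those columns were obtained.

```r
library(dplyr)
library(countrycode)

# Toy table with invented values; column names follow the World Bank export.
total_population <- data.frame(
  Country.Name = c("Italy", "Japan", "Brazil"),
  Country.Code = c("ITA", "JPN", "BRA"),
  X1960 = c(100, 200, 300),
  X1961 = c(110, 210, 310)
)

parsed <- total_population %>%
  # "X1960" -> "1960": read.csv() adds the "X" because a column name
  # cannot start with a digit when check.names = TRUE.
  rename_with(~ sub("^X", "", .x), .cols = matches("^X[0-9]{4}$")) %>%
  # Assumed mapping: look up both indicators from the ISO 3166 alpha-3 code.
  mutate(
    Continent = countrycode(Country.Code, origin = "iso3c", destination = "continent"),
    Region    = countrycode(Country.Code, origin = "iso3c", destination = "region")
  )

parsed
```

Once the columns are renamed, a year has to be quoted with backticks when used as a column name, for example `1960`.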

I then decided to check the population figures for different years, to verify that the resulting values were consistent with the well-known actual numbers: I used dplyr’s commands again to select and count data, doing something like:

library(dplyr)

# Look up the 1960 value stored in the "World" row
total_population %>%
    select(Country, `1960`) %>%
    filter(Country == "World")

As you can see, I did not have to count, because the “World” data are also a row of our dataset.
I had to be careful, though: apart from “World”, other non-Country data were also included as rows of the dataset. Indeed, when I did my first tests, things didn’t add up!
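
The pitfall can be reproduced with a toy table. In this sketch the values are invented, and aggregate rows are recognised by a missing Continent, which is an assumption about how they can be told apart from real Countries.

```r
library(dplyr)

# Toy rows with invented values: two Countries plus two aggregate rows.
total_population <- data.frame(
  Country   = c("Italy", "France", "Euro area", "World"),
  Continent = c("Europe", "Europe", NA, NA),
  `1960`    = c(100, 200, 300, 300),
  check.names = FALSE
)

# Naive sum over every row: the aggregates are counted on top of the Countries.
naive_total <- sum(total_population$`1960`)

# Sum over real Countries only (rows that have a Continent).
countries_total <- total_population %>%
  filter(!is.na(Continent)) %>%
  summarise(total = sum(`1960`)) %>%
  pull(total)

# Value stored in the "World" row, for comparison.
world_row <- total_population %>%
  filter(Country == "World") %>%
  pull(`1960`)

c(naive = naive_total, countries = countries_total, world = world_row)
```

In the toy data the Countries-only total matches the “World” row, while the naive sum is three times as large.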

References

[1] https://data.worldbank.org/
[2] https://www.ecology.com/population-estimates-year-2050/ (early ages)
[3] https://en.wikipedia.org/wiki/Demographic_transition